Genome assembly of 3 Amazonian Morpho butterfly species reveals Z-chromosome rearrangements between closely related species living in sympatry

Abstract The genomic processes enabling speciation and species coexistence in sympatry are still largely unknown. Here we describe the whole-genome sequencing and assembly of 3 closely related species from the butterfly genus Morpho: Morpho achilles (Linnaeus, 1758), Morpho helenor (Cramer, 1776), and Morpho deidamia (Höbner, 1819). These large blue butterflies are emblematic species of the Amazonian rainforest. They live in sympatry in a wide range of their geographical distribution and display parallel diversification of dorsal wing color pattern, suggesting local mimicry. By sequencing, assembling, and annotating their genomes, we aim at uncovering prezygotic barriers preventing gene flow between these sympatric species. We found a genome size of 480 Mb for the 3 species and a chromosomal number ranging from 2n = 54 for M. deidamia to 2n = 56 for M. achilles and M. helenor. We also detected inversions on the sex chromosome Z that were differentially fixed between species, suggesting that chromosomal rearrangements may contribute to their reproductive isolation. The annotation of their genomes allowed us to recover in each species at least 12,000 protein-coding genes and to discover duplications of genes potentially involved in prezygotic isolation like genes controlling color discrimination (L-opsin). Altogether, the assembly and the annotation of these 3 new reference genomes open new research avenues into the genomic architecture of speciation and reinforcement in sympatry, establishing Morpho butterflies as a new eco-evolutionary model.

Order of Authors Secondary Information: Response to Reviewers: All required changes were made.

Introduction
Chromosomal rearrangements are likely to play a major role in both adaptation and speciation processes (1,2). Inversions, for instance, can favour the emergence of adaptive syndromes by locking together co-adapted allelic variations [3]. Chromosomal rear-5 rangements have also been suggested to contribute to reproductive isolation between species by promoting divergent adaptation or by bringing together genetic incompatibilities [4]. Nevertheless, the role of structural variants in these evolutionary processes is still largely unknown. Recently-developed sequencing and assem- 10 bly methods now provide access to complete genomes, therefore opening the investigation of structural variation within and among species (see [5] for a review).
three closely-related Morpho species living in sympatry for a large range of their geographical distribution ( Fig. 1) : M. helenor, M. achilles and M. deidamia [6], thereby developing relevant resources to study the evolution of barriers to gene flow in sympatry. In Lepidoptera, specialization towards host-plant has been shown to be a 20 major factor affecting species diversification [7]. Such ecological specialization may favour speciation and co-existence in sympatry, and may stem from the evolution of gustatory receptors enabling plant recognition by females [8].
The evolution of visual [9] and olfactory signals [10] between 25 species may also limit gene flow between sympatric species of Lepidoptera. In the three Morpho species studied here, both males and females display conspicuous iridescent blue colour patterns on the dorsal side of their wings, combined with cryptic brownish colour on the ventral side [11]. Such a combination of dorso-ventral 30 pattern, associated with a fast and erratic flight, is thought to contribute to the high escape abilities from predators of these butterflies, promoting colour pattern convergence between sympatric species (i.e. escape mimicry, [12]). Parallel geographic variation of dorsal wing colour pattern has indeed been detected in the three 35 Morpho species studied here, suggesting local convergence promoted by predators behaviour [13]. Given the key role of colour pattern in both sexual selection and species recognition in diurnal butterflies, such a resemblance is thought to enhance reproductive interference between sympatric species [14]. Behavioural experi-40 ments carried out in the wild revealed that males from the three mimetic Morpho species are indeed attracted by both intra and interspecific wing patterns [15]. Despite this heterospecific attraction of males at long distances, RAD-sequencing markers revealed a highly limited gene flow between these three sympatric species 45 [15]. This might be due to the differences in the timing of daily activities observed between these sympatric species limiting heterospecific encountering [15]. This divergence in daily phenology may contribute to the initiation of speciation or to the reinforcement of pre-zygotic barriers to heterospecific matings. 50 Genetic incompatibilities may also contribute to speciation and reinforcement processes by generating post-zygotic barriers. For instance, variation in chromosome numbers has been shown to correlate with speciation rate in Lepidoptera [16]. Similarly, chromosomal inversions may fuel the speciation process: by capturing 55 genetic variations, inversions may lead to increased genetic divergence between species. Such divergence may lead to maladaption in hybrids and further limit gene flow between species living in sympatry.
By relying on both karyotype data and PacBio-Hifi sequenc- 60 ing, we generated de novo genome assemblies for three sympatric species of Morpho butterflies. The divergence between the two sister species M. helenor and M. achilles was estimated to occur about 3.91 My ago, while the divergence between these two sister species and M. deidamia was estimated to circa 16.68 My ago [17], enabling to compare the genome divergence in sympatry at different time scales. We then investigated the structural variants and variation in genes potentially contributing to pre-zygotic isolation among these species. We aim to shed light on the genomic processes involved in sympatric speciation and reinforcement as 70 well as detecting chromosomal rearrangements. We also provide their mitochondrial genomes, study their transposable element (TE) contents and annotate the genomes. These genomic resources will open new research avenues into the understanding of adaptive processes, such as convergence evolution of colour pattern or di-75 vergence in visual systems, as well as speciation and co-existence of sister-species in sympatry, establishing Morpho butterflies as a new eco-evolutionary model.

Material and methods
Butterfly sampling 80 Males from the species M. helenor (n = 1), M. achilles (n = 4) and M. deidamia (n = 2) were caught with a handnet at the Patawa waterfall, located in the Kaw mountain area of French Guiana (GPS location: 4.54322; -52.15832) to perform DNA extractions. In these species, males typically patrol in river beds and are easy to catch, 85 while females are more rarely encountered. We therefore focused on males only. Because in butterflies sex is controlled by a ZW sex chromosome system (females being the heterogametic sex), we were thus able to access the Z sex chromosome but not the W chromosome.

Karyotype study
Cytogenetic techniques were applied to wild caught males (M. helenor (n = 3), M. achilles (n = 4) and M. deidamia (n = 2)) that were collected at the above-mentioned location in 2019. Their testicles were dissected and processed shortly after capture following the 95 First et al. | 3 protocol described in [18]. The obtained cell suspension was conserved in fixative at about 4°C. The cell spreading and staining were then performed as described in [18]. The chromosome staining relied on the Giemsa method.

DNA extractions and genome sequencing
Live butterflies (M. helenor (n = 1), M. achilles (n = 4) and M. deidamia (n = 2)) captured in 2021 at the same site in French Guiana were killed in the lab and their body immediately placed in liquid nitrogen. The DNA extraction was carried out the following day using the Qiagen Genomic-tip 100/G kit and following supplier in-105 structions. The extracted DNA of a single male from each species was used (see Supplementary Fig. 1 for pictures of the wings of the sequenced specimens). Library preparation and sequencing were performed at GeT-PlaGe core facility (INRAe Toulouse) according to the manufacturer's instructions "Procedure and Checklist Prepar-110 ing HiFi SMRTbell Libraries using SMRTbell Express Template Prep Kit 2.0". At each step, DNA was quantified using the Qubit dsDNA HS Assay Kit (Life Technologies). DNA purity was tested using the nanodrop (Thermofisher) and size distribution and degradation assessed using the Femto pulse Genomic DNA 165 kb Kit (Agilent).

115
Purification steps were performed using AMPure PB beads (Pacific Biosciences). 15µg of DNA was purified then sheared at 15kb (speed 31 and 32) with the Megaruptor3 system (Diagenode). Using SM-RTbell Express Template prep kit 2.0, a Single strand overhangs removal, a DNA and END damage repair step were performed on 120 10µg of sample. Blunt hairpin adapters were then ligated to the library, which was treated with an exonuclease cocktail to digest unligated DNA fragments. A size selection step using a 10kb cutoff was performed on the BluePippin Size Selection system (Sage Science) with the "0.75 percent DF Marker S1 6-10 kb vs3 Improved 125 Recovery" protocol. Using Binding kit 2.2 and sequencing kit 2.0, the primer V5 annealed and polymerase 2.2 bounded library was sequenced by diffusion loading onto 1 SMRTcells per sample on SequelII instrument at 80 pM with a 2 hours pre-extension and a 30 hours movie.

K-mer analysis, genome size and heterozygosity estimation
We used Jellyfish (v.2.3.0) [19] to perform a k-mer analysis on each PacBio dataset with a k-mer size of 21. For each dataset k-mers were counted and aggregated (jellyfish count option) and histograms 135 were generated using the "-histo" command. The resulting histograms allowed the estimation of genome length and heterozygosity with GenomeScope version 2.0 [20] using the web application.

Nuclear and mitochondrial genome assembly
For the assembly of the nuclear genomes, we compared three long-140 read assembly tools: IPA-Improved Phased Assembler (v1.0.3-0) (https://github.com/PacificBiosciences/pbipa), Flye (v2.9) [21] and Hifiasm (v0.16.1 with the option -l3 to purge all types of haplotigs in the most aggressive way) [22]. For each assembler, we estimated basic assembly statistics such as scaffold count, contig count 145 and N50 using the "stats.sh" program from the BBMap v38.93 package [23]. The completeness of each assembly was assessed using BUSCO v5.2.2 and MetaEuk for gene prediction against the lepidoptera_odb10 database [24]. We retained the Hifiasm assembly because it had the highest BUSCO score, the highest contiguity 150 (N50) and longest contig. Despite the high level of purging performed by Hifiasm, the species (M. helenor and M. achilles respectively) retained a high level of duplicates in the BUSCO score. To remove false haplotypic duplications in these two species, we used Purge_dups v1.2.5 setting the cutoffs manually (with calcuts -l 5 155 -m 33 -u 135 for M. helenor and calcuts -l 10 -m 45 -u 145 for M. achilles) [25]. The completeness of the purged genomes was then reassessed using BUSCO. The mitochondrial genome of each species was assembled and circularized using Rebaler (https://github.com/rrwick/Rebaler) directly 160 from the PacBio Hifi reads and using the mitochondrial genome of the closely related species Pararge aegeria as a reference.

Annotation of repetitive regions
The annotation of repetitive regions in the three species was performed following two main steps. First, we used RepeatModeler 165 v2.0.2a [26] with the option -s (slow search) and -a (to get a .align output file) to create de novo libraries of repetitive elements for each species. The library was then used to hardmask the corresponding genome assembly using RepeatMasker 4.1.2.p1 [26] . A summary of the repeated elements was generated with the script 170 'buildSummary.pl' included in RepeatMasker.

Genome annotation
Each of the three genomes was independently annotated using Maker v2.31.10 [27], following the protocol given in [28]. In short, Maker is usually run several times successively and uses the gene 175 models generated in one round to train ab initio gene-predictors and improve the initial gene models in the next round (see below). We used the above-mentioned hardmasked genomes and carried out their annotation using the proteomes of three closely-related species, namely Pararge aegeria [29], Maniola hyperantus [30] and 180 Bicyclus anynana [31]. For each species, the output files were merged into a gff3 file that was then used to generate the necessary files to train SNAP (version 2006-07-28), an ab initio gene finding program [32]. A second run of Maker with the above-mentioned gff3 file and the .hmm file provided by SNAP resulted in a second gff3 file 185 that was used to train SNAP a second time. A third round of Maker with the second gff3 and .hmm files was followed by the training of Augustus (3.3.3), another gene prediction tool [33], with the third gff3 file. A final round of Maker with the third gff3 file and the files generated by Augustus led to the fourth and last gff3 file, containing 190 all the genome features for each species.
Protein-Protein BLAST 2.9.0+ (-evalue 1e-6 -max_hsps 1 -max_target_seqs 1) was then used to assess putative protein functions in each Morpho species by comparing the protein sequences given by Maker to the protein sequences from the annotated 195 genomes of Maniola jurtina [29], P. aegeria [29], B. anynana [31] and S. littoralis specifically for the detection of OR sequences [34]. We used BUSCO to assess the completeness of the proteome with the protein mode and the lepidoptera_odb10 database on the annotated gene set produced by MAKER [24].

Phylogenetic analysis
To specifically compare the exon sequences of the opsins detected in the Morpho genomes to the opsins described in other Lepidoptera, we retrieved the coding sequences of opsins from NCBI and used the software Mega v. 11

Synteny and rearrangement detection
To assess variation in chromosome-scale synteny, we compared the assemblies of each Morpho to the assembly of M. jurtina, the closest relative of Morpho with a karyotype of 29 chromosomes and for which a high quality chromosome-level assembly (based on N50 225 values and Busco score, accession ID GCF_905333055.1) is available [29]. We used MUMmer 3.23 [41] to align the masked assembled genomes of M. helenor, M. achilles and M. deidamia to the M. jurtina genome. The output produced by MUMmer is an ASCII delta file that was then filtered and parsed using the utility programs delta-filter 230 and show-coords from MUMmer. Synteny was visualized with the MUMmer results in R with the packages circlize v 0.4.12 [42] and Paletteer (https://github.com/EmilHvitfeldt/paletteer) using the Rscript from [43] described here: https: //github.com/bioinfowheat/Polygonia_calbum_genomics/ 235 blob/7c75aac624157faa3ab229e3fc1e0e315302194d/synteny/ circlePlot_nucmerOutput.R, removing short contigs, short alignments (less than 200bp) and low identity alignments (less than 90% identity).
In order to detect potential genome rearrangements between 240 Morpho and closely-related species, we estimated the wholegenome collinearity between the Morpho assemblies and five closely-related Nymphalidae species whose genomes exhibit a good-quality assemblies in the NCBI genome database: and Lasiommata megera (GCA_928268935.1) using D-GENIES [44]. Paired alignments between a Morpho species and one Nymphalidae species were performed using the minimap2 aligner [45] in D-GENIES, treating each Morpho species genome as the query and 250 the Nymphalidae species genome as the target reference. We also used D-GENIES to pair-compare the genomes of the three Morpho species. As D-GENIES revealed differences between Morpho species in the contig corresponding to the Z chromosome (see results), we used SyRI [46] to study in detail the rearrangements in 255 the sequences of this contig between the three species. We generated paired alignments of the Z contig with minimap2 and ran SyRI with the option -c on .sam files. SyRI requires that the two compared genomes represent the same strand and in the case of M. achilles, the orientation of the sequence produced by HiFiasm was 260 the complementary to the sequences of M. helenor and M. deidamia. We then reverse-complemented this sequence in order to make the alignments. All the genomic structures predicted by SyRI were plotted using plotsr [47].

Comparing karyotypes between species
First, we characterized the karyotypes of the three studied species (see Supplementary Fig. 2 to visualize the chromosomes). In M. helenor, the detected number of diploid chromosomes ranged from 54 to 56 in the different replicates of mitoses, with a discreet mode at  Surprisingly, the karyotype of the last male was quite different, with a modal number of 84 mitotic chromosomes. Interestingly, there was the same number (n = 28) of elements as above at the pachynema stage, indicating that they were trivalents. They were 280 thicker than bivalents and a more careful analysis showed the recurrent asynapsis of one of the 3 chromosomes ( Supplementary Fig. 3).
No "normal" metaphase I or II was observed. It was concluded that this specimen was triploid with 3n = 84, and probably sterile. In M. deidamia, the diploid chromosome number had a discreet mode of 285 2n = 54, suggesting a slightly smaller number of chromosome pairs (n = 27) in this more distantly-related species. Our result are consistent with the modal number of chromosomes in the Morphinae (n = 28) described in previous karyotypic studies conducted in 8 Morpho species [48], where the reported number of chromosomes 290 was also n = 28 for both M. helenor and M. achilles.
GenomeScope analyses suggested very high levels of heterozygosity for the three species (Table 1). In all of them, the N50 and contig sizes were generally larger in the assemblies produced by Hifiasm than in IPA and Flye assemblies (see supplementary table   295 1). The BUSCO scores revealed a very high percentage of duplicated sequences, especially in the assemblies produced by IPA and Flye. The use of purge_dups strongly reduced the number of duplicates, the estimated size of the genome and the number of final contigs (see Supplementary Fig. 4 and Sup.

Annotation of repetitive region
In each of the three species of Morpho, we annotated around 50% of the genome as repeated elements ( Supplementary Fig. 5) Supplementary  Fig. 5. For the three species, long interspersed nuclear elements 315 (LINE's) accounted for the largest percentage (between 13.53% and 17.22% ) of the repeated elements in the genomes.

Genome annotation
We recovered 12,651, 12,978 and 12,093 protein-coding genes in the genomes of M. helenor, M. achilles and M. deidamia respectively.  Fig. 5). We were thus able to further investigate gene families that could be involved in pre-zygotic isolation through duplication or loss events. This includes genes having a role in vision (L-opsin) but also chemosensory genes such as odorant and gustatory receptors that reflect the degree of species specialization.

Duplications in opsins genes
Vision in butterflies notably relies on opsins, for which three major types of molecules have been described depending on their wavelength of peak absorbance: in the ultraviolet (UV, 300-400 nm), blue (B, 400-500 nm) and long wavelength (L, 500-600 nm) part 345 of the visible spectrum. Opsins are encoded by UV, B and L opsin genes. We investigated the number of copies for each opsin gene in the three Morpho species. We consistently found one copy of the UV opsin gene and two copies of the B opsin genes in the three Morpho species. Duplications of the L-opsin were observed in M. achilles, 350 M. deidamia and M. helenor. In the other reference genomes M. jurtina, B. anynana and P. aegeria, a single copy of the UV opsin gene, the B opsin gene and the L opsin gene were found. By comparing the L-opsin sequences using a maximum likelihood tree based on the exon sequences (Fig. 2), we showed that the duplications ob-355 served in Morpho butterflies probably occurred independently from previously described duplications that happened in other clades of Lepidoptera. The phylogenetic relationships between the copies in the three species reveal that the duplications observed in the three Morpho species probably occurred before their speciation (Fig.   360 2). The detection of the different copies in different species within the Morpho genus and in closely-related genus is now required to precisely characterize the evolutionary origin of these duplications.

Odorant and gustatory receptors
In order to estimate the number of OR and GR genes in the three terpenes and aliphatics) as described in [34], showed that the loss of ORs in Morpho were not clustered around a particular set of genes ( Supplementary Fig. 9). Further functional characterization coupled with precise ecological investigations are therefore needed to understand the loss of ORs in the Morpho genus.

Conserved synteny with other Lepidoptera species
We found a high concordance between the n = 29 chromosomes of M. jurtina and the contigs of the three Morpho species (Fig. 3). The MUMmer alignment and the post alignment treatment to remove 395 short contigs and low identity alignments reduced the assembly to 27 contigs containing 97% of the total genome for M. helenor (removing 117 short contigs from the original assembly), 29 contigs (98% of the genome) for M. achilles (3 contigs removed) and 27 for M. deidamia (31 contigs removed) (Fig. 3). 400 The synteny plot between M. helenor and M. jurtina showed 27 contigs for M. helenor, one contig less than expected based on its karyotype of n=28. In the plot, one single contig (ptg000028l) was assigned to two different chromosomes from the M. jurtina assem-   We also found a high level of colinearity between the genomes of the three Morpho species and the five Nymphalidae species used for comparisons. The alignment between M. jurtina and the three Mor-430 pho species (Fig. 3) was very similar to the alignments obtained for the other Nymphalidae (Supplementary Fig. 6) and confirmed that the assembly of the genome of M. helenor by hifiasm might have merged together two chromsomes: the single contig ptg000028l was scattered into two chromosomes in the other Nymphalidae.

435
Although collinearity was generally high, we detected some putative inversions located in regions that varied among pairs for the three Morpho species in comparison with the Nymphalidae (see Supplementary Fig. 6). Interestingly, the contig corresponding to the chromosome Z was the only one consistently showing inver-440 sions in the pairwise genome-wide alignments (see Supplementary   Fig. 6).

Inversions in the Z-chromosome between the three sympatric Morpho species
Pairwise whole genome alignments of the three Morpho species 445 showed a very high similarity between genomes (see Supplementary Fig. 7). The only contig that differed between species was the one corresponding to the Z chromosome. SyRI identified one inversion of 1.6 Mb between M. helenor and M. deidamia, five inversions (comprising one of more than 1.8 Mb) between M. helenor and M. 450 achilles and two between M. deidamia and M. achilles with one of 1.6 Mb (Fig. 4). Interestingly, the inversion found in M. deidamia when compared to M. achilles or M. helenor has the same size and is located in exactly the same position of the chromosome (from bp 1567583 to 3192401), suggesting that this inversion is ancestral to 455 the speciation of M. achilles and M. helenor. In the case of M. achilles vs. M. helenor two inversions were found flanking the site of the putative ancient inversion and a bigger inversion was found at the end of the chromosome (Fig. 4).  chromosome. This suggests a high conservation of chromosomal synteny among closely-related Nymphalidae species, which is consistent with the high level of synteny observed throughout the whole Lepidoptera clade [52]. In the three species, genome heterozygosity was very high (from 1.68% in M. deidamia to 3.35% in 490 Morpho helenor) and heterozygosity presents a major challenge in de novo assembly of diploid genomes. Indeed, levels of heterozygosity of 1% or above are considered "moderate to high" and most assemblers struggle when two divergent haplotypes are sequenced together, as heterozygosity may impair the distinction of different 495 alleles at the same locus from paralogs at different loci [53]. Then, fi-nal assemblies of heterozygous genomes are expected to be of poorquality, highly fragmented and containing redundant contigs [54]. Hifiasm generated the most completely haplotype-resolved assemblies, nevertheless the level of heterozygosity clearly impacted the 500 quality of the assemblies and a post treatment to remove duplicated sequences was necessary for the two most heterozygous genomes (M. helenor and M. achilles), showing the difficulty that heterozygosity still imposes to long-read heterozygosity-aware assemblers. Such a high heterozygosity has been observed in other genomes 505 of Lepidoptera [31] and can be a signature of high effective population sizes. The wide Amazonian distribution of these species, and their flight activity could contribute to such high level of genetic diversity within population, because elevated dispersal contribute to increase gene flow within each species throughout their geographic 510 range. Our results also showed that around 50% of the genomes of the sequenced Morpho was composed of repeated elements, a very high proportion as compared to other genomes of Lepidoptera. In Lepidoptera, TE content has been found to be correlated with genome size [55], but in the case of the three Morpho species stud-515 ied here, the repeat content is higher than for other species with similar genome sizes such as the Bombyx mori moth, with a genome size estimated at 530 Mb and a TE content of 35% [56] or the more closely-related species Bicyclus anynana with a genome size of 475 Mb and a repeat content of 26% [31].

Structural variations between genomes of sympatric species
The karyotype and assembly analyses suggest some differences in chromosome number between the three sympatric Morpho species studied here, particularly between M. deidamia ( and speciation. First, hybrid-sterility models suggest reduced fertility or viability in individuals heterozygous for chromosomal rearrangements. These models are considered to be inconsistent and difficult to evaluate [4]. More recently, suppressed-recombination models propose that chromosomal rearrangements permit spe-535 ciation in sympatry because they reduce recombination between chromosomes carrying different rearrangements [4]. Indeed, in Lepidoptera, differences in chromosome number are proposed to be an important mechanism leading to species diversification in Agrodiaetus, Erebia and Lysandra butterflies (57,58,59).

540
Besides differences in chromosome numbers, we systematically found inversions in the contig corresponding to the Z chromosome when comparing the genomes of Morpho to the other Nymphalidae and between the three different Morpho species. Inversions are also a type of chromosomal rearrangement known to occur throughout 545 evolution and are considered an important mechanism for speciation particularly for species living in sympatry (1,4). Empirically and theoretically, it has been suggested that inversions may have contributed to speciation in sympatry in different groups of animals. In two ascidians species of the genus Ciona and in insects 550 like Drosophila inversions may promote speciation by reduction of the fitness or by causing sterility of heterozygotes (60,61). In the Anopheles gambiae species complex, inversions may allow for ecotypic differentiation and niche partitioning leading to different sympatric and genetically isolated populations [62]. In groups like 555 paserine birds where sexual differentiation is controlled by a ZW sex chromosome system (females being the heterogametic sex), inversions in the Z chromosome in particular seem to explain speciation in sympatry between close species. Cytological data show that across the Passeriformes, the Z chromosome has accumulated 560 more inversions than any other autosome and that the inversion 8 | GigaScience, 2022, Vol. 00, No. 0 fixation rate on the Z chromosome is 1.4 times greater than the average autosome. Interestingly, inversions on the Z chromosome are significantly more common in sympatric than in allopatric closely related clades (63,64).
In Lepidoptera, the role of inversions in speciation in sympatry has been studied in the species Heliconius melpomene and H. cydno, two sympatric species that can hybridize (although rarely) in the wild [65]. The analyses of the genomic differences between the two species showed some small inversions (less than 50 kb) and 570 there was no evidence for a reduction of recombination in hybrids, suggesting that in this case, inversions were not involved in the maintenance of the species barriers and other processes such as strong mate preference could prevent hybridisation in the wild [65]. In the Morpho species studied here however, we found inversions be-575 tween Morpho Z chromosomes that were longer than 1.5 Mb. Models suggest that to be associated with adaptive traits or species barriers, inversions should typically be megabases long in order to be fixed in populations [65]. tionary forces could be acting to maintain them. The copy number variation detected in genes involved in colour perception (i.e. Lopsin) may also play a significant role in reproductive isolation in these sympatric species. For instance, the three copies of L opsins found in the Papilio genus (Fig. 2) have been found to also show 595 subfunctionalization and neofunctionalization [66]. The duplication followed by genetic divergence observed in these three mimetic Morpho species may improve their visual discrimination capacities, and facilitate species recognition, therefore reinforcing barriers to gene flow in sympatry. Genes potentially involved in colour pattern 600 variations (e.g. bricabrac or bab) may also play a role in prezygotic isolation but they were not thoroughly investigated here as their functional evolution involves changes in regulatory sequences rather than events of duplication or gene loss [67]. Interestingly, an orthologous search of the putative proteic sequences of each Morpho 605 species against those of M. jurtina allowed us to uncover different copy numbers of the gene bricabrac, which play a significant role in differences of UV iridescence between males of two incipient species of sulphur butterflies [68]. The copy responsible for the presence/absence of UV iridescence is located on the Z chromosome and in the three Morpho species, we found one or more copies of bricabrac on the contigs that correspond to the Z chromosome: M. deidamia had one copy of bricabrac, while M. helenor and M. achilles displayed two copies of this gene. It seems however that the second copy in M. helenor and M. achilles correspond to truncated 615 copies of bricabrac. While this is certainly the sign of an ancient duplication followed by a pseudogenization event, this could lead to further investigations of putative functions of the truncated copies. It is worth noting that variations in the number of bab copies was also observed in the three reference genomes used for the blast: M. 620 jurtina had two copies on the Z chromosome (including a truncated copy), B. anynana had only one and P. aegeria had none. The investigation of gene levels of polymorphism on the Z chromosomes would also be of great interest as genes linked to the Z chromosome are often among the most divergent between closely-related species 625 [69].
Altogether, the assembly and annotation of these three mimetic species of Morpho butterflies reveal differences in chromosome numbers, the presence of several Mb-long inversions in the Z chro-mosome, as well as copy number variation and genetic divergence 630 among copies of genes that may play a significant role in reproductive isolation. Our study thus open new avenues into the investigation of the ecological and genomic factors involved in sympatric speciation and its reinforcement. 635 Fastq files, genome assemblies, assembly methods and collection information were uploaded at the ENA web site (https://www.ebi.ac.uk/ena/browser/home) under the project number PRJEB56642. Genome accession numbers are ERZ14213098 for Morpho helenor, ERZ14213099 for M. achilles and ERZ14213100 640 for M. deidamia. Other data further supporting this work are openly available in the GigaScience repository, GigaDB (70,71,72).

Ethical Approval
All butterflies used in this study were sampled in French Guiana and used by researchers from French Institutions. Following the recom-645 mendation of the Nagoya protocol, our results were presented to the Université Antilles/Guyane in French Guiana during a seminar by VL. VL and VD also produced a popularization movie and performed several talks in public school from French Guiana, insuring a feed-back from our research on natural resources toward local 650 populations.